1) Create input training data & its training input function
2) Create input test data & its testing input function
3) Choose feature columns and an existing or custom classifier
4) Train and test the model using the classifier
5) Perform prediction using the model
We need to make sure we have good-quality data for the machine learning model to make accurate predictions; the more training data we have, the better the model's predictions will be.
In this example we will use the well-known Iris dataset, which comes as a simple CSV file with four feature columns and the species as the fifth column.
In the code below we read the CSV data into a pandas DataFrame and split it into two separate sets, which will be used for training and testing respectively.
inputdata = pd.read_csv(DATAPATH, names=ALL_CSV_COLUMNS, header=0)
train_x, test_x, train_y, test_y = sk.train_test_split(
    inputdata.loc[:, FEATURE_COLUMNS], inputdata.loc[:, 'Species'], test_size=0.10)
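The snippet above relies on a few imports and constants that are not shown. A minimal sketch of what they could look like is below; the file path and the sk alias are assumptions, and the column names follow the standard Iris CSV layout used later in this post.

import pandas as pd
import tensorflow as tf
from sklearn import model_selection as sk  # provides sk.train_test_split

DATAPATH = 'iris.csv'  # path to your copy of the Iris CSV (assumed)
ALL_CSV_COLUMNS = ['SepalLength', 'SepalWidth', 'PetalLength', 'PetalWidth', 'Species']
FEATURE_COLUMNS = ALL_CSV_COLUMNS[:-1]  # the four measurement columns
SPECIES = ['Setosa', 'Versicolor', 'Virginica']  # class names used when printing predictions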
Now an input function is used to feed the data to the TensorFlow classifier in batches:
def train_input_fn(features, labels, batch_size):
    # Convert the inputs to a Dataset.
    dataset = tf.data.Dataset.from_tensor_slices((dict(features), labels))
    # Shuffle, repeat, and batch the examples.
    dataset = dataset.shuffle(1000).repeat().batch(batch_size)
    # Return the dataset.
    return dataset
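To sanity-check the pipeline, you can pull a single batch out of the returned dataset. This is only a quick sketch assuming TensorFlow 1.x graph mode; the batch size of 32 is arbitrary.

# Build the training dataset and fetch one batch to confirm the shapes look right.
dataset = train_input_fn(train_x, train_y, batch_size=32)
next_batch = dataset.make_one_shot_iterator().get_next()
with tf.Session() as sess:
    batch_features, batch_labels = sess.run(next_batch)
    print({name: values.shape for name, values in batch_features.items()})
    print(batch_labels.shape)  # (32,)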
Using the held-out test data, we can write a similar input function for evaluation; it supports both labeled and unlabeled datasets, so the same function can be reused for prediction later.
def eval_input_fn(features, labels, batch_size):
    features = dict(features)
    if labels is None:
        # No labels, use only features.
        inputs = features
    else:
        inputs = (features, labels)
    # Convert the inputs to a Dataset.
    dataset = tf.data.Dataset.from_tensor_slices(inputs)
    # Batch the examples.
    assert batch_size is not None, "batch_size must not be None"
    dataset = dataset.batch(batch_size)
    # Return the dataset.
    return dataset
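The same function therefore serves both evaluation and prediction; the two call forms look like this (the batch size is just an illustrative value).

# Labeled test data -> (features, labels) pairs, used by classifier.evaluate()
eval_ds = eval_input_fn(test_x, test_y, batch_size=32)

# No labels -> features only, used later by classifier.predict()
pred_ds = eval_input_fn(test_x, labels=None, batch_size=32)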
Next we select the feature column names for the classifier; since all four features are real-valued measurements, we describe them as numeric columns:
my_feature_columns = []
for key in train_x.keys():
    my_feature_columns.append(tf.feature_column.numeric_column(key=key))
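Since train_x holds the four Iris measurements, the loop above is equivalent to spelling the columns out by hand; the explicit form is shown only for clarity.

my_feature_columns = [
    tf.feature_column.numeric_column(key=name)
    for name in ['SepalLength', 'SepalWidth', 'PetalLength', 'PetalWidth']
]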
Choose a classifier appropriate for the data at hand; here we use a DNNClassifier with two hidden layers of 10 nodes each.
# Build a DNN with two hidden layers of 10 nodes each.
classifier = tf.estimator.DNNClassifier(
    feature_columns=my_feature_columns,
    hidden_units=[10, 10],
    n_classes=3)
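The choice of estimator is not limited to a deep network. As an illustration (not part of the original walkthrough), a linear softmax baseline can be dropped in with the same feature columns and input functions:

# A simpler linear baseline using the same feature columns.
linear_classifier = tf.estimator.LinearClassifier(
    feature_columns=my_feature_columns,
    n_classes=3)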
Now let's train and evaluate the classifier using the DNNClassifier and the input functions defined above.
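The training code below refers to batch_size and train_steps; these are hyperparameters you choose yourself, and the values here are only typical starting points.

batch_size = 100    # number of examples per training batch (typical starting value)
train_steps = 1000  # total number of training steps (typical starting value)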
# Train the model.
classifier.train(
    input_fn=lambda: train_input_fn(train_x, train_y, batch_size),
    steps=train_steps)

# Evaluate the model.
eval_result = classifier.evaluate(
    input_fn=lambda: eval_input_fn(test_x, test_y, batch_size))

print('\nTest set accuracy: {accuracy:0.3f}\n'.format(**eval_result))
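eval_result is an ordinary dict of metrics, so besides the accuracy used above you can print everything the estimator reports (typically average_loss, loss, and global_step as well).

# Inspect every metric returned by evaluate().
for metric, value in eval_result.items():
    print(metric, '=', value)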
As the final step, the model is now ready for prediction. We will use a predetermined set of inputs with known species to verify the model's accuracy.
# Generate predictions from the model.
expected = ['Setosa', 'Versicolor', 'Virginica']
predict_x = {
    'SepalLength': [5.1, 5.9, 6.9],
    'SepalWidth': [3.3, 3.0, 3.1],
    'PetalLength': [1.7, 4.2, 5.4],
    'PetalWidth': [0.5, 1.5, 2.1],
}

predictions = classifier.predict(
    input_fn=lambda: eval_input_fn(predict_x, labels=None, batch_size=batch_size))
The predictions are returned as a generator, which can be iterated to see the results; each prediction dict contains the predicted class id and the per-class probabilities.
for pred_dict, expec in zip(predictions, expected):
    template = '\nPrediction is "{}" ({:.1f}%), expected "{}"'
    class_id = pred_dict['class_ids'][0]
    probability = pred_dict['probabilities'][class_id]
    # SPECIES is the list of class names defined with the other constants above.
    print(template.format(SPECIES[class_id], 100 * probability, expec))
I hope we were able to explain the TensorFlow high-level API in a simple, step-by-step approach. Please feel free to contact support@clofus.com with any queries. Thanks!